Construction of DNA sequences and their use for microbial production of proteins, in particular human serum albumin.

By means of reverse transcription of mRNA coding
for a desired polypeptide, there is obtained a set of
overlapping fragments of duplex cDNA, which together
correspond to the whole mRNA molecule. The fragments
have overlapping regions bearing sites for restriction
enzymes, such that cutting and ligation gives DNA
corresponding to the polypeptide. This is introduced into
a vector in reading frame with a promoter. Transformation
of a microorganism enables expression of the polypeptide.

The construction via fragments enables large molecules
to be made. Thus, human serum albumin (HSA)
is produced by E. coli transformed with plasmid pHSA1.
This includes DNA made from fragments derived from
reverse transcription of mRNA from human liver.

0073646

CONSTRUCTION OF DNA SEQUENCES AND THEIR USE
FOR MICROBIAL PRODUCTION OF PROTEINS, IN
PARTICULAR HUMAN SERUM ALBUMIN

This invention relates to recombinant DNA technology.
It particularly relates to the application of the technology
to the production of human serum albumin (HSA) in
microorganisms for use in the therapeutic treatment of humans.
In one aspect the invention relates to a technique for
producing DNA sequences encoding desired polypeptides. In
another aspect it relates to the construction of microbial
expression vehicles containing DNA sequences encoding a
protein, e.g. human serum albumin or the biologically active
component thereof, operably linked to expression effecting
promoter systems, and to the expression vehicles so
constructed. In another aspect, the present invention
relates to microorganisms transformed with such expression
vehicles, thus directed in the expression of the DNA
sequences referred to above. In yet other aspects, this invention
relates to the means and methods of converting the end product of such
expression to entities, such as pharmaceutical compositions, useful for
the therapeutic treatment of humans. In preferred embodiments, this
invention provides for particular expression vectors that are sequenced
properly such that mature human serum albumin is produced directly.

In one aspect, the present invention is particularly directed to a method of
preparing cDNA encoding polypeptides or biologically active portions
thereof. This aspect provides the means and methods of utilizing
synthetic primer DNA corresponding to a portion of the mRNA of the
intended polypeptide, adjacent to a known endonuclease restriction site,
in order to obtain by reverse transcription a series of DNA fragments
encoding sequences of the polypeptide. These fragments are prepared such
that the entire desired protein coding sequence is represented, the
individual fragments containing overlapping DNA sequences harboring
common endonuclease restriction sites within the corresponding
overlapping sequence. This aspect facilitates the selective cleavage and
ligation of the respective fragments so as to assemble the entire cDNA
sequence encoding the polypeptide in proper reading frame. This
discovery permits the obtention of cDNA of high molecular weight proteins
which otherwise may not be available through use of usual reverse
transcriptase methods and/or chemical synthesis.
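
The cut-and-ligate assembly of overlapping fragments can be illustrated
with a minimal sketch. Everything in it is an assumption made for
illustration only: the toy message, the fragment boundaries and the helper
name join_at_site are hypothetical, and the "cut" is simplified to the
start of the recognition sequence (real enzymes cut at a fixed offset and
may leave overhangs). The two sites used are the recognition sequences of
PstI (CTGCAG) and BglII (AGATCT).

```python
def join_at_site(upstream: str, downstream: str, site: str) -> str:
    """Join two overlapping fragments at a restriction site they share.

    Simplification (assumption): the cut is placed at the start of the
    recognition sequence rather than at the enzyme's true cut position.
    """
    cut_up = upstream.rfind(site)      # last occurrence in the upstream fragment
    cut_down = downstream.find(site)   # first occurrence in the downstream fragment
    if cut_up < 0 or cut_down < 0:
        raise ValueError(f"site {site} not present in both fragments")
    return upstream[:cut_up] + downstream[cut_down:]


# Hypothetical full-length message, used only to generate the toy fragments.
message = ("ATGGATGCACACAAGAGTGAG" "CTGCAG" "TTGCTCATCGGTTTAA"
           "AGATCT" "GGGAGAAGAAAATTTCTAA")

# Three partial clones whose ends overlap across the two shared sites.
frag_5prime = message[:32]     # ends shortly after the CTGCAG site
frag_middle = message[18:56]   # carries both sites
frag_3prime = message[40:]     # starts shortly before the AGATCT site

step1 = join_at_site(frag_5prime, frag_middle, "CTGCAG")
assembled = join_at_site(step1, frag_3prime, "AGATCT")
print(assembled == message)
```

Because each junction falls inside sequence common to both neighbouring
fragments, the joined product keeps the original register of the coding
sequence, which is the point of placing the restriction sites in the
overlaps.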

The publications and other materials hereof used to illuminate the
background of the invention, and in particular cases, to provide
additional details respecting its practice are incorporated herein by
reference, and for convenience, are numerically referenced in the
following text and grouped in the appended bibliography.

(A) Human Serum Albumin

Human serum albumin (HSA) is the major protein species in adult
serum. It is produced in the liver and is largely responsible
for maintaining normal osmolality in the bloodstream and
functions as a carrier for numerous serum molecules (1, 2).
The apparent fetal counterpart of HSA is α-fetoprotein and
studies have been undertaken to compare the two as well as rat
serum albumin and α-fetoprotein (3-8). The complete protein
sequence of HSA has been published (9-12). The published
protein sequences of HSA disagree in about 20 residues as well
as in the total number of amino acids in the mature protein
[584 amino acids (9); 585 (12)]. Some evidence suggests that
HSA is initially synthesized as a precursor molecule (13, 14)
containing a "prepro" sequence. The precursor forms of bovine
(15) and rat (16) serum albumin have also been sequenced.

The rationale for the therapeutic use of albumin is the
treatment of hypovolemia, hypoproteinemia and shock. Albumin
currently is used to improve the plasma oncotic (colloid
osmotic) pressure, caused by solutes (colloids) which are not
able to pass through capillary pores. Inasmuch as albumin has
a low permeability constant, it essentially confines itself to
the intravascular compartment. When different concentrations
of nondiffusible particles exist on opposite sides of the cell
membrane, water crosses the partition until the concentrations
of particles are equal on both sides. In this process of
osmosis, albumin plays a vital role in maintaining the liquid
content in blood.

Thus, the therapeutic benefits of albumin administration reside
primarily in the treatment of conditions where there is a loss
of liquid from the intravascular compartment, such as in
surgical operations, shock, burns, and hypoproteinemia
resulting in edema. Albumin is also used for diagnostic
applications in which its nonspecific ability to bind to other
proteins makes it useful in various diagnostic solutions.

Presently, human serum albumin is produced from whole blood
by fractionation techniques, and thus is not available in large
amounts at competitive costs. The application of recombinant
DNA technology makes possible the production of copious amounts
of human serum albumin by genetically directing
microorganisms to produce it efficiently. The present
invention may enable the availability of purified HSA
produced through recombinant DNA technology more abundantly and
at lower cost than is presently possible. The present
invention also provides knowledge of the DNA sequence
organization of human serum albumin and its deduced amino acid
sequence, helping to elucidate the evolutionary, regulatory,
and functional properties of human serum albumin as well as its
related proteins such as α-fetoprotein.

More particularly, the present invention provides for the isolation
of cDNA clones spanning the entire sequence of the protein
coding and 3' untranslated portions of HSA mRNA. These cDNA
clones were used to construct a recombinant expression vehicle
which directed the expression in a microorganism strain of the
mature HSA protein under control of the trp promoter. The
present invention also provides the complete nucleotide and
deduced amino acid sequence of HSA.

Reference herein to the expression of "mature human serum
albumin" connotes the microbial production of human serum
albumin unaccompanied by the presequence ("prepro") that
immediately attends translation of the human serum albumin
mRNA. Mature human serum albumin, according to the present
invention, is immediately expressed from a translation start
signal (ATG), which also encodes the amino acid methionine
linked to the first amino acid of albumin. This methionine
amino acid can be naturally cleaved by the microorganism so as
to prepare the human serum albumin directly. Mature human
serum albumin can be expressed together with a conjugated
protein other than the conventional leader, the conjugate being
specifically cleavable in an intra- or extracellular
environment. See British patent publication number 2007676A.
Finally, the mature human serum albumin can be produced in
conjunction with a microbial signal polypeptide which
transports the conjugate to the cell wall, where the signal is
processed away and the mature human serum albumin secreted.

(B) Recombinant DNA Technology



With the advent of recombinant DNA technology, the controlled
microbial production of an enormous variety of useful
polypeptides has become possible. Many mammalian polypeptides,
such as human growth hormone and human and hybrid leukocyte
interferons, have already been produced by various
microorganisms. The power of the technology admits the
microbial production of an enormous variety of useful
polypeptides, putting within reach the microbially directed
manufacture of hormones, enzymes, antibodies, and vaccines
useful for a variety of drug-targeting applications.



A basic element of recombinant DNA technology is the plasmid,
an extrachromosomal loop of double-stranded DNA found in
bacteria, oftentimes in multiple copies per cell. Included in
the information encoded in the plasmid DNA is that required to
reproduce the plasmid in daughter cells (i.e., a "replicon" or
origin of replication) and ordinarily, one or more phenotypic
selection characteristics, such as resistance to antibiotics,
which permit clones of the host cell containing the plasmid of
interest to be recognized and preferentially grown in selective
media. The utility of bacterial plasmids lies in the fact that
they can be specifically cleaved by one or another restriction
endonuclease or "restriction enzyme", each of which recognizes
a different site on the plasmid DNA. Thereafter heterologous
genes or gene fragments may be inserted into the plasmid by
endwise joining at the cleavage site or at reconstructed ends
adjacent to the cleavage site. (As used herein, the term
"heterologous" refers to a gene not ordinarily found in, or a
polypeptide sequence ordinarily not produced by, a given
microorganism, whereas the term "homologous" refers to a gene
or polypeptide which is found in, or produced by, the
corresponding wild-type microorganism.) Thus formed are
so-called replicable expression vehicles.
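
As a rough illustration of how each restriction enzyme recognizes its own
site on a plasmid, the sketch below scans a circular sequence for the
recognition sequences of four enzymes named in this specification (PstI,
XbaI, HpaII, BglII). The plasmid sequence toy_plasmid and the helper
find_sites are hypothetical and serve only to show the idea; positions are
0-based and refer to the start of the recognition sequence, not to the
actual cut position.

```python
# Recognition sequences (5'->3'); all four are palindromic, so scanning
# one strand is sufficient.
SITES = {"PstI": "CTGCAG", "XbaI": "TCTAGA", "HpaII": "CCGG", "BglII": "AGATCT"}


def find_sites(plasmid: str, site: str) -> list[int]:
    """Return 0-based start positions of `site` on a circular sequence."""
    n = len(plasmid)
    # Append the first len(site)-1 bases so a match can span the origin.
    wrapped = plasmid + plasmid[: len(site) - 1]
    hits = []
    start = wrapped.find(site)
    while start != -1 and start < n:
        hits.append(start)
        start = wrapped.find(site, start + 1)
    return hits


# Hypothetical 38-base "plasmid"; its PstI site straddles the origin.
toy_plasmid = "GCAGTTTCTAGAAACCGGTTAGATCTAAACCGGTTTCT"

for enzyme, site in SITES.items():
    print(enzyme, find_sites(toy_plasmid, site))
```

An enzyme with a single hit opens the circle at one defined place, which
is what makes it useful for inserting a heterologous fragment; an enzyme
with several hits would cut the plasmid into several pieces.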

DNA recombination is performed outside the microorganism, and
the resulting "recombinant" replicable expression vehicle, or
plasmid, can be introduced into microorganisms by a process
known as transformation and large quantities of the
heterologous gene-containing recombinant vehicle obtained by
growing the transformant. Moreover, where the gene is properly
inserted with reference to portions of the plasmid which govern
the transcription and translation of the encoded DNA message,
the resulting expression vehicle can be used to actually
produce the polypeptide sequence for which the inserted gene
codes, a process referred to as expression.

Expression is initiated in a DNA region known as the promoter.
In the transcription phase of expression, the DNA unwinds,
exposing the sense coding strand of the DNA as a template for
initiated synthesis of messenger RNA from the 5' to 3' end of
the entire DNA sequence. The messenger RNA is, in turn, bound
by ribosomes, where the messenger RNA is translated into a
polypeptide chain having the amino acid sequence for which the
DNA codes. Each amino acid is encoded by a nucleotide triplet
or "codon"; the codons collectively make up the "structural gene",
i.e., that part of the DNA sequence which encodes the amino
acid sequence of the expressed polypeptide product.

Translation is initiated at a "start" signal (ordinarily ATG,
which in the resulting messenger RNA becomes AUG). So-called
stop codons, transcribed at the end of the structural gene,
signal the end of translation and, hence, of the production of
further amino acid units. The resulting product may be
obtained by lysing the host cell and recovering the product by
appropriate purification from other proteins.
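
The start-codon, triplet-by-triplet, stop-codon logic just described can
be sketched in a few lines. This is only an illustration under simple
assumptions: it reads the coding strand directly (skipping the mRNA
intermediate), uses the standard genetic code, and the input toy_gene and
the function name express are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def express(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start < 0:
        return ""                      # no start signal, nothing is translated
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":          # stop codon ends translation
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical structural gene with two flanking bases on each side.
toy_gene = "CCATGAAATTTGGGTAACC"
print(express(toy_gene))  # MKFG
```

The leading M in the output corresponds to the methionine encoded by the
ATG start signal itself.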

In practice, the use of recombinant DNA technology can express
entirely heterologous polypeptides (so-called direct
expression) or alternatively may express a heterologous
polypeptide fused to a portion of the amino acid sequence of a
homologous polypeptide. In the latter cases, the intended
bioactive product is rendered bioinactive within the fused,
homologous/heterologous polypeptide until it is cleaved in an
extracellular environment. See Wetzel, American Scientist 68,
664 (1980).

If recombinant DNA technology is to fully sustain its promise,
systems must be devised which optimize expression of gene
inserts, so that the intended polypeptide products can be made
available in controlled environments and in high yields.

(C) State of the Art 

Sargent et al., in Proc. Natl. Acad. Sci. (USA) 78, 243 (1981),
describe the cloning of rat serum albumin messenger RNA in a
series of recombinant DNA plasmids. This was done to determine
the nucleotide sequences of the clones in order to study the
evolutionary hypothesis of the protein product. Thus, these
workers made no attempt to assemble the cDNA fragments they
prepared.

In Journal of Supramolecular Structure and Cellular
Biochemistry, Supplement 5, 1981, Alan R. Liss, Inc. NY,
Dugaiczyk et al. report, in abstract form, their studies of the
human gene for human serum albumin. They obtained cDNA
fragments but there is no evidence that these workers cloned or
produced the fragments for any purpose other than for studying
the basic molecular biology of the α-fetoprotein and serum
albumin genes.

The present invention is based upon the discovery that recombinant DNA
technology can be used to successfully and efficiently produce human
serum albumin in direct form. The product is suitable for use in
therapeutic treatment of human beings in need of supplementation of
albumin. The product is produced by genetically directed microorganisms
and thus the potential exists to prepare and isolate HSA in a more
efficient manner than is presently possible by blood fractionation
techniques. It is noteworthy that we have
succeeded in genetically directing a microorganism to produce a
protein of enormous length: 584 amino acids, corresponding to an mRNA
transcript upwards of about 2,000 bases.

The present invention comprises the human serum albumin thus produced and
the means and methods of its production. The present invention is
further directed to replicable DNA expression vehicles harboring gene
sequences encoding HSA in directly expressible form. Further, the
present invention is directed to microorganism strains transformed with
the expression vehicles described above and to microbial cultures of such
transformed strains, capable of producing HSA. In still further aspects,
the present invention is directed to various processes useful for
preparing said HSA gene sequences, DNA expression vehicles, microorganism
strains and cultures and to specific embodiments thereof. Still further,
this invention is directed to the preparation of cDNA sequences encoding
polypeptides which are heterologous to the microorganism host, such as
human serum albumin, utilizing synthetic DNA primer sequences
corresponding in sequence to regions adjacent to known restriction
endonuclease sites, such that individual fragments of cDNA can be
prepared which overlap in the regions encoding the common restriction
endonuclease sites. This embodiment enables the precise cleavage and
ligation of the fragments so as to prepare the properly encoded DNA
sequence for the intended polypeptide.



The work described herein involved the expression of human serum albumin
(HSA) as a representative polypeptide which is heterologous to the
microorganism employed as host. Likewise the work described involved use
of the microorganism E. coli K-12 strain 294 (end A, thi-, hsr-,
hsm+), as described in British Patent Publication No. 2055382 A.
This strain has been deposited with the American Type Culture Collection,
ATCC Accession No. 31446.

The invention, in its most preferred embodiments, is described with
reference to E. coli, including not only E. coli strain 294,
defined above, but also other known E. coli strains such as E. coli B,
E. coli χ1776 and E. coli W 3110, or other microbial strains many of
which are deposited and (potentially) available from recognized
microorganism depository institutions, such as the American Type Culture
Collection (ATCC); cf. the ATCC catalogue listing. See also German
Offenlegungsschrift 2644432. These other microorganisms include, for
example, Bacilli such as Bacillus subtilis and other enterobacteriaceae
among which can be mentioned as examples Salmonella typhimurium and
Serratia marcescens, utilizing plasmids that can replicate and express
heterologous gene sequences therein. Yeast, such as Saccharomyces
cerevisiae, may also be employed to advantage as host organism in the
preparation of the proteins hereof by expression of genes
coding therefor under the control of a yeast promoter. (See the
copending U.S. patent application of Hitzeman et al., filed February 25,
1981 (Attorney Docket No. 100/43), assignee Genentech, Inc. et al., or the
corresponding European Application 82300949.3, which are
incorporated herein by reference.)

Preferred embodiments of the invention will now be described
with reference to the accompanying drawings in which:

Figs. 1A and B are diagrams for use in explaining the
construction of plasmid pHSA1;

Fig. 2 shows the immunoprecipitation of bacterially
synthesised HSA; and

Fig. 3 shows the amino acid sequence of HSA and the
corresponding DNA sequence.

In Fig. 1A, the top line represents the mRNA coding for the human serum
albumin protein and below it the regions contained in the cDNA
clones F-47, F-61 and B-44 described further herein. The
initial and final amino acid codons of the mature HSA mRNA are
indicated by circled 1 and 585 respectively. Restriction
endonuclease sites involved in the construction of pHSA1 are
shown by vertical lines. An approximate size scale in
nucleotides is included.

The completed plasmid pHSA1 is shown in Fig. 1B, with HSA coding regions
derived from cDNA clones shaded as in A). Selected restriction
sites and terminal codons number 1 and 585 are indicated as
above. The E. coli trp promoter-operator region is shown with
an arrow representing the direction of transcription. G:C
denotes an oligo dG:dC tail. The leftmost XbaI site and the
initiation codon ATG were added synthetically. The
tetracycline (Tc) and ampicillin (Ap) resistance genes in the
pBR322 portion of pHSA1 are indicated by a heavy line.

Figure 2 depicts the immunoprecipitation of bacterially synthesized HSA.

E. coli cells transformed with albumin expression plasmid pHSA1
(lanes 4 and 5) or control plasmid pLeIF A25 (containing an
interferon α gene in the identical expression vehicle; lanes 2, 3
and 7) were grown in 35S-methionine-supplemented media. Samples
in lanes 2, 4 and 7 were induced for expression from the trp
promoter in M9 media lacking tryptophan; samples in lanes 3 and 5
were grown in tryptophan-containing LB broth to repress the trp
promoter. Each sample lane of the autoradiograph of the
SDS-polyacrylamide gel presented here contains labeled protein
immunoprecipitated from 0.75 ml of cells at a density of A550 = 1.
Lanes 1 and 6 contain radioactive protein standards (BRL) whose
molecular weight in kilodaltons is indicated at the left.
Bacterially synthesized HSA is seen in lane 4 comigrating with the
68,000 d 14C-labeled bovine serum albumin standards. Increased
production of serum albumin in the induced versus repressed culture
of pHSA1 represents higher levels of synthesis of plasmid encoded
protein rather than a difference in 35S-methionine pool specific
activities for minimal versus rich media (data not shown). The
sharp band at 60,000 d is an apparent artifact; this band is seen in
both induced and repressed pHSA1 and control transformants, and
binds to preimmune (lane 7) as well as anti-HSA IgGs (lanes 2-5).
The minor 47,000 d band in lane 4 is apparently plasmid encoded and
may represent a prematurely terminated form of bacterially
synthesized HSA.

Figure 3 depicts the nucleotide and amino acid sequence of human serum
albumin.

The DNA sequence of the mature protein coding and 3' untranslated
regions of HSA mRNA was determined from the recombinant plasmid
pHSA1. The DNA sequence of the prepro peptide coding and 5'
untranslated regions was determined from the plasmid P-14 (see
text). Predicted amino acids are included above the DNA sequence
and are numbered from the first residue of the mature protein. The
preceding 24 amino acids comprise the prepro peptide. The five
amino acid residues which disagree with the protein sequence of HSA
reported by both Dayhoff (9) and Meloun et al. (12) are underlined.
The above nucleotide sequence probably does not extend to the true
5' terminus of HSA mRNA. In the albumin direct expression plasmid
pHSA1, the mature protein coding region is immediately preceded by
the E. coli trp promoter-operator-leader peptide ribosome binding
site (36, 37), an artificial XbaI site, and an artificial initiation
codon ATG; the prepro region has been excised. The nucleotides
preceding HSA codon no. 1 in pHSA1 read 5'-TCACGTAAAAAGGGTATCTAGATG.
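
The arrangement of the XbaI site and the initiation codon in the quoted
24-nucleotide sequence can be checked with a short sketch. The sequence is
the one given in the text; the variable names are illustrative, and the
only outside facts assumed are the XbaI recognition sequence (TCTAGA) and
its cut position (T^CTAGA), so that a filled-in XbaI end terminates in
...TCTAG.

```python
XBAI_SITE = "TCTAGA"
START_CODON = "ATG"

# Sequence quoted in the text as immediately preceding HSA codon no. 1 in pHSA1.
junction = "TCACGTAAAAAGGGTATCTAGATG"

site_at = junction.find(XBAI_SITE)       # 0-based start of the XbaI site
start_at = junction.rfind(START_CODON)   # 0-based start of the initiation codon

# The last base of the XbaI site and the first base of ATG are the same A.
shares_base = site_at + len(XBAI_SITE) - 1 == start_at

# A filled-in XbaI end stops after ...TCTAG; joining it blunt-end to a
# fragment that begins with ATG restores the full TCTAGA site.
vector_blunt_end = junction[: site_at + 5]
rejoined = vector_blunt_end + START_CODON

print(site_at, start_at, shares_base, rejoined == junction)
```

The check shows that the site and the start codon overlap by one base, so
the adenosine of the ATG is what completes the XbaI site at the junction.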

Detailed Description 

(A) Synthesis and Cloning of cDNA. Poly(A)+ RNA was prepared from
quickly frozen human liver samples obtained from biopsy or from
cadaver donors by either ribonucleoside-vanadyl complex (17) or
guanidinium thiocyanate (18) procedures. cDNA reactions were
performed essentially as described in (19) employing as primers
either oligodeoxynucleotides prepared by the phosphotriester
method (20) or oligo (dT)12-18 (Collaborative Research). For
typical cDNA reactions 25-35 µg of poly(A)+ RNA and 40-80 pmol
of oligonucleotide primer were heated at 90° for 5 minutes in
50 mM NaCl. The reaction mixture was brought to final
concentrations of 20 mM Tris HCl pH 8.3, 20 mM KCl, 8 mM
MgCl2, 30 mM dithiothreitol, 1 mM dATP, dCTP, dGTP, dTTP
(plus 32P-dCTP (Amersham) to follow recovery of product) and
allowed to anneal at 42°C for 5 minutes. 100 units of AMV reverse
transcriptase (BRL) were added and incubation continued at 42°
for 45 minutes. Second strand DNA synthesis, S1 treatment,
size selection on polyacrylamide gels, deoxy (C) tailing and
annealing to pBR322 which was cleaved with PstI and deoxy (G)
tailed, were performed as previously described (21, 22). The
annealed mixture was used to transform E. coli K-12 strain 294
(23) by a published procedure (24).

(B) Screening of Recombinant Plasmids with 32P-labelled Probes.
E. coli transformants were grown on LB-agar plates containing
5 µg/ml tetracycline, transferred to nitrocellulose filter paper
(Schleicher and Schuell, BA85) and tested by hybridization
using a modification of the in situ colony screening procedure
(25). 32P-end labelled (26) oligodeoxynucleotide fragments
of from 12 to 16 nucleotides in length were used as direct
hybridization probes, or 32P-cDNA probes were synthesized
from RNA using oligo(dT) or oligodeoxynucleotide primers (19).
Filters were hybridized overnight in 5X Denhardt's solution
(27), 5xSSC (1xSSC = 0.15 M NaCl, 0.015 M Na citrate), 50 mM Na
phosphate pH 6.8, 20 µg/ml salmon sperm DNA at temperatures
ranging from 4° to 42° and washed in salt concentrations
varying from 1 to 0.2xSSC plus 0.1 percent SDS at temperatures
ranging from 4° to 42° depending on the length of the
32P-labelled probe (28). Dried filters were exposed to Kodak
XR-2 X-ray film using DuPont Lightning-Plus intensifying
screens at -80°.
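
For orientation, the salt concentrations implied by the SSC strengths used
for hybridization and washing can be worked out as follows. The sketch
assumes the conventional definition of 1xSSC (0.15 M NaCl, 0.015 M sodium
citrate); the helper name ssc is hypothetical.

```python
# Molar composition of 1x SSC (conventional definition, assumed here).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc(strength: float) -> dict[str, float]:
    """Molar concentration of each solute at the given SSC strength."""
    return {solute: round(molar * strength, 4) for solute, molar in SSC_1X.items()}


# 5x for hybridization; 1x down to 0.2x for the washes.
for strength in (5, 1, 0.2):
    print(f"{strength}x SSC:", ssc(strength))
```

Lower SSC strength means lower ionic strength and therefore a more
stringent wash, which is why the wash conditions are varied with the
length of the probe.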


(C) DNA Preparation and Restriction Enzyme Analysis. Plasmid DNA
was prepared in either large scale (29) or small scale
("miniprep"; 30) quantities and cleaved by restriction
endonucleases (New England Biolabs, BRL) following
manufacturers' conditions. Slab gel electrophoresis conditions
and electroelution of DNA fragments from gels have been
described (31).

(D) DNA Sequencing. DNA sequencing was accomplished by both the
method of Maxam and Gilbert (26) utilizing end-labelled DNA
fragments and by dideoxy chain termination (32) on single
stranded DNA from phage M13 mp7 subclones (33) utilizing
synthetic oligonucleotide (20) primers. Each region was
independently sequenced several times.

(E) Construction of 5' End of Albumin Gene for Direct Expression of
HSA. 10 µg (~16 pmol) of the ~1200 bp PstI insert of
plasmid F-47 was boiled in H2O for 5 minutes and combined
with 100 pmol of 32P-end labelled 5' primer
(dATGGATGCACACAAG). The mixture was quenched on ice and
brought to a final volume of 120 µl of 6 mM Tris HCl pH 7.5, 6
mM MgCl2, 60 mM NaCl, 0.5 mM dATP, dCTP, dGTP, dTTP at 0°.
10 units of DNA polymerase I Klenow fragment (Boehringer-
Mannheim) were added and the mixture incubated at 24° for 5
hr. Following phenol/chloroform extraction, the product was
digested with HpaII, electrophoresed in a 5 percent
polyacrylamide gel, and the desired 450 bp fragment
electroeluted. The single stranded overhang produced by XbaI
digestion of the vector plasmid pLeIF A25 (21) was filled in to
produce blunt DNA ends by adding deoxynucleoside triphosphates
to 10 µM and 10 units DNA polymerase I Klenow fragment to the
restriction endonuclease reaction mix and incubating at 12° for
10 minutes. Restriction endonuclease fragments (0.1-1 µg in
approximate molar equality) were annealed and ligated overnight
at 12° in 20 µl of 50 mM Tris HCl pH 7.6, 10 mM MgCl2, 0.1 mM
EDTA, 5 mM dithiothreitol, 1 mM rATP with 50 units T4 ligase
(N.E. Biolabs). Further details of plasmid construction are
discussed below.

(F) Protein Analysis. Two ml cultures of recombinant E. coli
strains were grown in either LB or M9 media plus 5 µg/ml
tetracycline to densities of A550 = 1.0, pelleted, washed,
repelleted, and suspended in 2 ml of LB or supplemented M9 (M9
+ 0.2 percent glucose, 1 µg/ml thiamine, 20 µg/ml standard
amino acids except methionine was 2 µg/ml and tryptophan was
excluded). Each growth medium also contained 5 µg/ml
tetracycline and 100 µCi 35S-methionine (NEN; 1200 Ci/mmol).
After 1 hr incubation at 37°, bacteria were pelleted, freeze-
thawed and resuspended in 200 µl 50 mM Tris HCl pH 7.5, 0.12 mM
NaEDTA then placed on ice for 10 minutes following subsequent
additions of lysozyme to 1 mg/ml, NP40 to 0.2 percent, and NaCl
to 0.35 M. The lysate was adjusted to 10 mM MgCl2 and
incubated with 50 µg/ml DNase I (Worthington) on ice for 30
min. Insoluble material was removed by mild centrifugation.
Samples were immunoprecipitated with rabbit anti-HSA (Cappel
Labs) and staphylococcal absorbent (Pansorbin; Cal Biochem) as
described (34), and subjected to SDS polyacrylamide gel
electrophoresis (35).

cDNA Cloning. Initial cDNA clones primed with oligo (dT) were
screened by colony hybridization with both total liver cDNA (to
identify abundant RNA species containing clones) and with two
32P-labelled cDNAs primed from liver mRNA by two sets of four
11 base oligodeoxynucleotides synthesized to represent the
possible coding variations for amino acids 546-549 and 294-297
of HSA. Positive colonies never contained more than about the
3' 1/2 of the protein coding region of the expected HSA mRNA
sequence. (The longest of these recombinants was designated
B-44.) Since existing procedures were unable to directly copy
an mRNA of the expected size (~2000 bp), synthetic
oligodeoxynucleotides were prepared to correspond to the
antimessage strand at regions near the 5' extreme of B-44.
From the nucleotide sequence of B-44, we constructed a 12 base
oligodeoxynucleotide corresponding to amino acids 369-373.
This was used to prime cDNA synthesis of liver mRNA and produce
cDNA clones in pBR322 containing the 5' portion of the HSA
message while overlapping the existing B-44 recombinant.
Approximately 400 resulting clones were screened by colony
hybridization with a 16 base oligodeoxynucleotide fragment
located slightly upstream in the mRNA sequence we had thus far
determined. Approximately 40 percent of the colonies
hybridized to both probes. Many of those colonies which failed
to contain hybridizing plasmids presumably resulted from RNA
self-priming or priming with contaminating oligo (dT) during
reverse transcription, or lost the 3' region containing the
sequence used for screening. "Miniprep" amounts of plasmid DNA
from hybridizing colonies were digested with PstI. Three
recombinant plasmids contained sufficiently large inserts to
code for the remaining 5' portion of the HSA message. Two of
these (F-15 and F-47) contained the extreme 5' coding portion
of the gene but failed to extend back to a PstI site necessary
for joining with B-44 to reform the complete albumin gene.
Recombinant F-61 possessed this site but lacked the 5' extreme
end. A three part reconstruction of the entire message
sequence was possible employing restriction endonuclease sites
in common with the part length clones F-47, F-61 and B-44
(Fig. 1).


An additional cDNA clone extending further 5' was obtained by
similar oligodeoxynucleotide primed cDNA synthesis (from a
primer corresponding to amino acid codons no. 175-179).
Although not employed in the construction of the mature HSA
expression plasmid, this cDNA clone (P-14) allowed
determination of the DNA sequence of the "prepro" peptide
coding and 5' non-coding regions of the HSA mRNA.




The mature HSA mRNA sequence was joined to a vector plasmid for 
direct expression of the mature protein in E. coli via the trp 
promoter-operator. The plasmid pLeIF A25 directs the 
expression of human leukocyte interferon A (IFNα2) (21). It 
was digested with XbaI and the cleavage site "filled in" to 
produce blunt DNA ends with DNA polymerase I Klenow fragment 
and deoxynucleoside triphosphates. After subsequent digestion 
with PstI, a "vector" fragment was gel purified that contained 
pBR322 sequences and a 300 bp fragment of the E. coli trp 
promoter, operator, and ribosome binding site of the trp leader 
peptide terminating in the artificially blunt ended XbaI 
cleavage site. A 15 base oligodeoxynucleotide was designed to 
contain the initiation codon ATG followed by the 12 nucleotides 
coding for the first four amino acids of HSA as determined by 
DNA sequence analysis of clone F-47. In a process referred to 
as "primer repair", the gene-containing PstI fragment of F-47 
was denatured, annealed with excess 15-mer and reacted with DNA 
polymerase I Klenow fragment and deoxynucleoside triphosphates. 
This reaction extends a new second strand downstream from the 
annealed oligonucleotide, degrades the single stranded DNA 
upstream of codon number one and then polymerizes upstream 
three nucleotides complementary to ATG. In addition, when this 
product is blunt-end ligated to the prepared vector fragment, 
its initial adenosine residue recreates an XbaI restriction 
site. Following the primer repair reaction, the DNA was 
digested with HpaII and a 450 bp fragment containing the 5' 
portion of the mature albumin gene was gel purified (see Fig. 
1). This fragment was annealed and ligated to the vector 
fragment and to the gel isolated HpaII to PstI portion of F-47 
and used to transform E. coli cells. Diagnostic restriction 
endonuclease digests of plasmid minipreps identified the 
recombinant A-26 which contained the 5' portion of the mature 
albumin coding region ligated properly to the trp promoter- 
operator. For the final steps in assembly, the A-26 plasmid 
was digested with BglII plus PstI and the ~4 kb fragment was 
gel purified. This was annealed and ligated to a 390 bp PstI, 
BglII partial digestion fragment purified from F-61 and a 1000 
bp PstI fragment of B-44. Restriction endonuclease analysis of 
resulting transformants identified plasmids containing the 
entire HSA coding sequence properly aligned for direct 
expression of the mature protein. One such recombinant plasmid 
was designated pHSA1. When E. coli containing pHSA1 is grown 
in minimal media lacking tryptophan, the cells produce a 
protein which specifically reacts with HSA antibodies and 
comigrates with HSA in SDS polyacrylamide electrophoresis (Fig. 
2). No such protein is produced by identical recombinants 
grown in rich broth, implying that production in E. coli of the 
putative HSA protein is under control of the trp 
promoter-operator as designed. To ensure the integrity of the 
HSA structural gene in the recombinant plasmid, pHSA1 was 
subjected to DNA sequence analysis. 
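
The design of the 15-mer and the way its leading adenosine restores the XbaI site can be checked with a short sketch. Several things here are assumptions rather than statements from the text. The four codons GAT GCA CAC AAG (Asp-Ala-His-Lys) are taken from the start of the mature coding region in Fig. 3. The upstream vector sequence is a placeholder of N's. The helper names are hypothetical. The XbaI recognition sequence (TCTAGA, cut T^CTAGA) and the standard genetic code are general knowledge.

```python
from itertools import product

XBAI_SITE = "TCTAGA"  # XbaI recognition sequence; the enzyme cuts T^CTAGA

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: first four codons of mature HSA as read from Fig. 3.
FIRST_FOUR_CODONS = ["GAT", "GCA", "CAC", "AAG"]


def design_primer(codons, start="ATG"):
    """Initiation codon followed by the codons of the first residues."""
    return start + "".join(codons)


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def filled_in_xbai_end(upstream):
    """Top strand of a vector arm after XbaI cleavage and Klenow fill-in.

    Cleavage leaves a 5'-CTAG overhang; filling it in gives a blunt end
    whose top strand finishes in ...TCTAG.
    """
    return upstream + "TCTAG"


primer = design_primer(FIRST_FOUR_CODONS)
junction = filled_in_xbai_end("NNNN") + primer  # "NNNN" is a placeholder

print(primer, len(primer))
print(translate(primer))
print(XBAI_SITE in junction)
```

Because the filled-in end stops at ...TCTAG, any blunt fragment beginning with A restores TCTAGA at the junction, which gives a convenient diagnostic site for correctly joined recombinants.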

DNA Sequence Analysis 

The albumin cDNA portion (and surrounding regions) of pHSA1 
was sequenced to completion by both the chemical degradation 
method of Maxam and Gilbert (26) and the dideoxy chain 
termination procedure employing templates derived from single 
stranded M13 mp7 phage derivatives (32, 33). All nucleotides 
were sequenced at least twice. The DNA sequence is shown in 
Fig. 3 along with the predicted amino acid sequence of the HSA 
protein. The DNA sequence further 5' to the mature HSA coding 
region was also determined from the cDNA clone P-14 and is 
included in Fig. 3. 

DNA sequence analysis confirmed that the artificial initiation 
codon and the complete mature HSA coding sequence directly 
follow the E. coli trp promoter-operator as desired. The ATG 
initiator follows the putative E. coli ribosome binding 
sequence (36) of the trp leader peptide (37) by 9 nucleotides. 

Translation of the DNA sequence of pHSA1 predicts a mature HSA 
protein of 585 amino acids. Various published protein 
sequences of HSA disagree at about 20 amino acids. The present 
sequence differs by eleven residues from Moulon et al. (12), 
and by 28 residues from that reported in the Dayhoff catalogue 
(9) credited as arising primarily from Behrens et al. (10) with 
contributions by Moulon et al. (12). Most of these differences 
represent inversions of pairs of adjacent residues or 
glutamine-glutamic acid disagreements. Only at five of the 585 
residues does our sequence differ from the residue reported by 
both Dayhoff (9) and Moulon et al. (12), and three of these 
five differences represent glutamine-glutamic acid interchanges 
(underlined in Figure 3). At all discrepant positions the 
nucleotide sequencing has been carefully rechecked and it is 
unlikely that DNA sequencing errors are the cause of these 
reported differences. The possibility of artifacts introduced 
by cDNA cloning cannot be ruled out. However, other likely 
explanations exist for the amino acid sequence differences 
among various reports. These include changes in amidation 
(affecting glutamine-glutamic acid discrimination) occurring 
either in vivo or during protein sequencing (38). Polymorphism 
in HSA proteins may also account for some differences; over 
twenty genetic variants of HSA have been detected by protein 
electrophoresis (39) but have not yet been analyzed at the 
amino acid sequence level. It is also worth noting that our 
predicted HSA protein sequence is 585 amino acids long, in 
agreement with Moulon (12) but not Dayhoff (9). The difference 
is accounted for by the deletion (in ref. 9) of one 
phenylalanine (Phe) residue in a Phe-Phe pair at amino acids 
156-157. 
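
The categories of disagreement described above (inversions of adjacent residues, glutamine-glutamic acid interchanges, and other substitutions) can be tallied mechanically once two sequences are aligned. This is a minimal sketch under stated assumptions. The two short sequences are invented and are not HSA data, the function name `classify_differences` is hypothetical, and the sequences are assumed to be already aligned to equal length, so a deletion such as the missing Phe at 156-157 would have to be gapped beforehand.

```python
def classify_differences(ours, theirs):
    """Classify mismatches between two aligned one-letter protein sequences.

    Returns a dict mapping each category to a list of 1-based positions.
    An adjacent-pair inversion is reported once, at its first position.
    """
    if len(ours) != len(theirs):
        raise ValueError("sequences must be aligned to equal length")
    found = {"inversion": [], "gln_glu": [], "other": []}
    i = 0
    while i < len(ours):
        if ours[i] == theirs[i]:
            i += 1
            continue
        swapped = (
            i + 1 < len(ours)
            and ours[i] == theirs[i + 1]
            and ours[i + 1] == theirs[i]
        )
        if swapped:
            found["inversion"].append(i + 1)
            i += 2  # both residues of the swapped pair are accounted for
        elif {ours[i], theirs[i]} == {"Q", "E"}:
            found["gln_glu"].append(i + 1)
            i += 1
        else:
            found["other"].append(i + 1)
            i += 1
    return found


# Invented example sequences, for illustration only.
ours = "DAHKSEVQHRF"
theirs = "DAHKESVEHRL"
result = classify_differences(ours, theirs)
print(result)
```

Counting an inversion as one event rather than two mismatches is a design choice of this sketch; the residue counts quoted in the text may follow a different convention.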

When compared to the DNA sequence of a rat serum albumin cDNA 
clone (16) the present mature HSA sequence shares 74 percent 
homology at the nucleotide and 73 percent homology at the amino 
acid level. (The rat SA protein is one amino acid shorter than 
HSA; the carboxy terminal residue of HSA is absent in the rat 
protein.) All 35 cysteine residues are located in identical 
positions in both proteins. The predicted "prepro" peptide 
region of HSA shares 76 percent nucleotide and 75 percent amino 
acid homology with that reported from the rat cDNA clone (16). 
Interspecies sequence homology is reduced in the portion of the 
3' untranslated region which can be compared (the published rat 
cDNA clone ends before the 3' mRNA terminus). The HSA cDNA 
contains the hexanucleotide AATAAA 28 nucleotides before the 
site of poly(A) addition. This is a common feature of 
eukaryotic mRNAs first noted by Proudfoot and Brownlee (40). 
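
Locating the AATAAA hexanucleotide relative to the poly(A) addition site is a simple string search. This is a minimal sketch under stated assumptions. The test sequence is invented, the function name `polya_signal_distance` is hypothetical, the input is assumed to end exactly at the poly(A) addition site, and the distance is counted from the end of the hexanucleotide, which may differ from the convention used in the text.

```python
def polya_signal_distance(cdna, signal="AATAAA"):
    """Nucleotides between the last occurrence of `signal` and the 3' end.

    `cdna` is assumed to end at the poly(A) addition site (poly(A) tail
    removed). Returns None if the signal is absent.
    """
    i = cdna.rfind(signal)
    if i < 0:
        return None
    return len(cdna) - (i + len(signal))


# Invented 3' untranslated region: signal followed by 28 filler nucleotides.
toy_utr = "TTTCC" + "AATAAA" + "GCT" * 9 + "G"
print(polya_signal_distance(toy_utr))
```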

Pharmaceutical Compositions 

The compounds of the present invention can be formulated according to 
known methods to prepare pharmaceutically useful compositions, whereby 
the polypeptide hereof is combined in admixture with a pharmaceutically 
acceptable carrier vehicle. Suitable vehicles and their formulation are 
described in Remington's Pharmaceutical Sciences by E.W. Martin, which is 
hereby incorporated by reference. Such compositions will contain an 
effective amount of the protein hereof together with a suitable amount of 
vehicle in order to prepare pharmaceutically acceptable compositions 
suitable for effective administration to the host. One preferred mode of 
administration is parenteral. 

CLAIMS: 

1. A method of constructing a DNA sequence encoding a 
polypeptide comprising a functional protein or a bioactive 
portion thereof, said DNA sequence being designed for 
insertion together with appropriately positioned 
translational start and stop signals into an expression vector 
under the control of a microbially operable promoter, 
comprising the steps of: 

(a) providing messenger RNA comprising the entire 
    coding sequence of said polypeptide, 
(b) obtaining by reverse transcription from the 
    messenger RNA of step (a) a series of fragments of 
    double stranded cDNA, each of said fragments 
    corresponding in sequence to a portion of said 
    coding sequence and thus encoding a portion of 
    said polypeptide, wherein said fragments overlap 
    in sequence at the respective terminal regions 
    thereof, the overlapping portions thereof 
    containing common restriction endonuclease sites, 
    said fragments in totality comprising the entire 
    coding sequence of said polypeptide, 
(c) cleaving the fragments of step (b) so as to 
    prepare corresponding fragments which, when 
    properly ligated, encode said polypeptide, and 
(d) ligating the fragments obtained from step (c). 

2. A method of constructing a vector for use in 
expressing a polypeptide comprising performing the method of 
claim 1 to produce a product comprising the entire coding 
sequence of said polypeptide, and introducing the product 
into a vector under proper reading frame control of an 
expression promoter. 

3. The method according to claim 1 or 2 wherein said 
polypeptide comprises the amino acid sequence of human serum 
albumin. 

4. The method according to claim 3 wherein the polypeptide 
contains a cleavable conjugate or microbial signal 
protein attached to the N-terminus of the ordinarily first 
amino acid of said human serum albumin. 

5. The method according to claim 4 wherein said cleavable 
conjugate is the amino acid methionine. 

6. A method according to any preceding claim wherein said 
DNA sequence is the gene encoding human serum albumin. 

7. A DNA sequence consisting essentially of a sequence 
encoding human serum albumin. 

8. A DNA sequence according to claim 7 operably linked 
with a DNA vector capable of effecting the microbial 
expression of said sequence so as to prepare the 
corresponding human serum albumin. 

9. A replicable microbial expression vehicle capable, in 
a transformant microorganism, of expressing the DNA sequence 
according to claim 7. 

10. A microorganism transformed with the vehicle according 
to claim 9. 

11. A fermentation culture comprising a transformed 
microorganism according to claim 10. 

12. The microorganism according to claim 10, obtained by 
transforming an E. coli bacterial or a yeast strain. 

13. The plasmid pHSA1. 

14. An E. coli bacterial strain transformed with the 
plasmid according to claim 13. 

15. A process which comprises microbially expressing human 
serum albumin in mature form. 

16. The use of human serum albumin prepared by the process 
of claim 15 for therapeutic treatment of humans or for 
preparing pharmaceutical compositions useful for such 
treatment. 

Fig. 3. Nucleotide sequence of the HSA cDNA with the predicted 
amino acid sequence, comprising the 5' non-coding region and 
"prepro" peptide (determined from clone P-14), the mature HSA 
coding region (585 residues, numbered at intervals of 50), and 
the 3' untranslated region extending to the poly(A) site. 